Abstract

The Canadian traveler problem (CTP) is the problem of traversing a
given graph, where some of the edges may be blocked, a state which is
revealed only upon reaching an incident vertex. Originally stated by
Papadimitriou and Yannakakis (1991), the adversarial version of CTP was
shown to be PSPACE-complete, with the stochastic version shown to be
#P-hard.

We show that stochastic CTP is also PSPACE-complete: we first prove
PSPACE-hardness for the dependent version of stochastic CTP, and then
introduce gadgets that allow us to extend the proof to the independent
case.

Since for disjoint-path graphs, CTP can be solved in polynomial time, 
we examine the complexity of the more general remote-sensing CTP, and 
show that it is NP-hard even for disjoint-path graphs. 

Keywords: Canadian Traveler Problem, Complexity of Navigation under 
Uncertainty, Stochastic Shortest Path with Recourse 



1. Introduction 

In the stochastic Canadian traveler problem (CTP) [1] we are given an
undirected connected weighted graph $G = (V, E)$, a source vertex $s \in V$,
and a target vertex $t \in V$. Any edge $e \in E$ may be blocked with a known
probability $p(e)$. The actual state of each edge $e \in E$ becomes known only
upon reaching a vertex incident on $e$. Traversing an unblocked edge $e$ incurs
a non-negative cost equal to the weight of $e$. The problem is to find a policy
$\pi$ that minimizes the expected traversal cost $C(\pi)$ from $s$ to $t$.

CTP formalizes a basic question of navigating in a partially known en- 
vironment, which is a fundamental task for transportation, autonomous 
robotic systems, computer games, and more. Other variants of CTP have 
been introduced and analyzed in the research literature [21 El H]. There 
has been a strong recent resurgence of interest in CTP, both theoretical 
[HI E] and empirical [3 El E]. A preliminary alternative proof of Theorem 2
appears in an unpublished work by one of the authors [10].

When originally introduced in [1], two variants were examined: the
adversarial variant and the stochastic variant. The adversarial variant was
shown to be PSPACE-complete by reduction from QSAT. For the stochastic
version, membership in PSPACE was shown, but only #P-hardness was
established, by reduction from the st-reliability problem, leaving the question
of PSPACE-hardness open. Apparently, proving the stronger result requires
some form of dependency between the edges, achieved "through the back
door" in the adversarial variant. This paper settles the question, showing
that CTP is indeed PSPACE-complete.

Since the size of an optimal policy is potentially exponential in the size 
of the problem description, we in fact show that it is PSPACE-hard to find 
even the optimal first action at s. 

We begin with a variant of CTP with dependent directed edges, CTP-Dep,
which allows for a simple proof of PSPACE-hardness by reduction from
QSAT, before proceeding with the proof for the "standard" stochastic CTP.
Although the latter result subsumes the former, proving the dependent CTP 
result first greatly simplifies the intuition behind the proof of the standard 
case. 

Another variant we explore is remote-sensing CTP, henceforth called
Sensing-CTP, in which additional actions called remote-sensing actions are
allowed. Each such action reveals, for a certain cost, the status of a
non-incident edge. Recently it was shown [9] that stochastic CTP can be solved
in low-order polynomial time on disjoint-path graphs. It was believed that
generalizing CTP to allow remote-sensing actions makes the problem harder;
indeed, we show that allowing remote sensing makes CTP NP-hard even
on disjoint-path graphs.

2. Dependent directed CTP is PSPACE-hard



This general form of dependent CTP (called CTP-Dep) is a 5-tuple
$(G, w, s, t, B)$, where $G = (V, E)$ is a directed graph,
$w : E \to \Re^{\geq 0}$ is a weight function, $s, t \in V$ are the start and goal
vertices respectively, and $B$ is a distribution model over binary random
variables indexed by the edges $E$. We assume that $B$ is specified as a Bayes
network over these random variables [11] as follows. Each random variable
expresses the state (blocked, unblocked) of an edge in $E$ (abusing notation,
we use the symbols indicating the edges to denote the respective random
variables). The Bayes network $(E, A, P)$ consists of a set of directed arcs $A$
between the random variables $E$, so that $(E, A)$ is a directed acyclic graph.
$P$ describes the conditional probability tables, one for each $e \in E$.

Theorem 1. CTP-Dep is PSPACE-hard. 

Proof. By reduction from QSAT [12]. Recall that QSAT is the language of
all satisfiable quantified boolean formulas (QBF),
$\Phi = \forall x_1 \exists x_2 \ldots \varphi(x_1, x_2, \ldots, x_n)$,
where $\varphi$ is a boolean formula in conjunctive normal form, with $n$ variables
and $m$ clauses. Each clause contains literals, each consisting of either a
variable or a negated variable. We assume that each clause has at most 3 literals.
Given a QBF $\Phi$, construct a CTP-Dep instance $(G_\Phi, w, s, t, B)$ as follows
(see Fig. 1). $G_\Phi$ consists of a variables section and an exam section.
Vertices in the variables section have labels starting with $v$ or $o$, and vertices of
the exam section begin with $r$. An always unblocked edge $(s, t)$, called the
default edge, has a cost of $h$. All other edges, unless mentioned otherwise,
are zero-cost edges known to be unblocked.

The variables section contains a subsection $X_i$ for every variable $x_i$,
which begins at $v_i$ and ends at $v'_i$. For every $i < n$, $X_i$ is connected to
$X_{i+1}$ through an edge $(v'_i, v_{i+1})$.

Every $X_i$ contains a true-path $(v_i, v_{i1}, \ldots, v_{im}, v'_i)$ and a
false-path $(v_i, \bar{v}_{i1}, \ldots, \bar{v}_{im}, v'_i)$.
If $x_i$ is a universal variable (resp. existential variable), the edges $(v_i, v_{i1})$
and $(v_i, \bar{v}_{i1})$ are called universal edges (resp. existential edges). While the
existential edges are always unblocked, we set the universal edges to
have blocking probability 1/2 and to be mutually exclusive: for each
universal variable $x_i$, exactly one of $(v_i, v_{i1})$ and $(v_i, \bar{v}_{i1})$ is blocked.

In addition, for every $1 \le i \le n$ and $1 \le l \le m$, there are edges
$(o_{il}, v_{il})$ and $(\bar{o}_{il}, \bar{v}_{il})$, called observation edges. These
edges are only meant to be observed, as their source vertices are unreachable.
Every observation edge is blocked with probability 1/2, and the dependency of
the observation edges is defined according to the appearance of variables in the
clauses of $\Phi$, as follows: an observation edge $(o_{il}, v_{il})$ (resp.
$(\bar{o}_{il}, \bar{v}_{il})$) is considered "in" a clause $C_l$ if $x_i$ appears
unnegated (resp. negated) in clause $C_l$. All observation edges
that are "in" the same clause $C_l$ co-occur: they are either all traversable or
all blocked (with probability 1/2, as stated above), independent of all
other edges that are not "in" $C_l$.

Figure 1: Reduction from QBF to CTP-Dep. Note that vertex t appears twice in order
to simplify the physical layout.

The exam section consists of an odd-path $(r_0, r_1, r'_1, t)$ and an even-path
$(r_0, r_2, r'_2, t)$. In addition, construct edges $(r_1, t)$ and $(r_2, t)$ with cost 1. The
edges $(r_1, r'_1)$ and $(r_2, r'_2)$ are called choice edges. The edge $(r_1, r'_1)$ (resp.
$(r_2, r'_2)$) is unblocked if and only if the observation edges are unblocked for
an odd (resp. even) number of clauses. Hence exactly one of the choice edges
is blocked. If at least one observation edge in each clause is observed, the
status of the choice edges can be determined with certainty. Otherwise the
posterior blocking probability of each choice edge remains 1/2. In order to
prove the theorem, it is sufficient to prove the following claim:

Claim 1. An optimal policy has expected cost 0 just when $\Phi$ is satisfiable
(in which case the optimal first action is to traverse $(s, v_1)$). Otherwise (for
any $h < 2^{-n/2-1}$) the optimal policy is to traverse $(s, t)$ with a cost of $h$.

Footnote 1. Note that as every clause has at most three literals, this dependency
structure can be realized with a Bayes network of constant in-degree, a
construction that has polynomial size.

Proof. Suppose first that $\Phi$ is satisfiable. Then there is a policy for
assigning values to all the existential variables, each given every setting of the
enclosing universal variables, such that $\varphi$ is true. Following this policy for
each existential variable $x_i$, i.e. traversing edge $(v_i, v_{i1})$ if $x_i$ should be
true and $(v_i, \bar{v}_{i1})$ otherwise, leads (by construction) to following a path such that
at least one observation edge is seen in every clause. Hence, the "exam" is
passed (i.e. the 0-cost unblocked path in the exam section is chosen) with
certainty.

Next, suppose $\Phi$ is not satisfiable. Then there is at least one setting
of the universal variables for which some clause $C_l$ is false under the same
conditions, and thus no edge "in" clause $C_l$ is observed. Since every setting
of these variables occurs with probability $2^{-n/2}$ (assuming w.l.o.g. that $n$ is
even), in these cases the exam is "flunked" (picking the path where only
the expensive edge is unblocked) with probability 1/2, and thus the total
expected cost of starting with $(s, v_1)$ is at least $2^{-n/2-1}$. Hence, setting
$h < 2^{-n/2-1}$, the optimal policy is to traverse $(s, t)$ if and only if $\Phi$ is not satisfiable.

□ 
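The argument can be checked on small formulas with a short Python sketch. It is an illustrative model only, not part of the reduction: it assumes odd-indexed variables are universal (settled by a fair coin, as with the mutually exclusive universal edges), even-indexed variables are existential (chosen by the traveler), and that an unobserved clause costs 1/2 in expectation, as in the proof above. The function names and the clause encoding are hypothetical.

```python
from fractions import Fraction

def clause_satisfied(clause, assignment):
    # A literal k > 0 stands for x_k, and k < 0 for the negation of x_k.
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def exam_cost(clauses, n, i=1, assignment=None):
    """Expected cost of starting with (s, v_1) under optimal existential choices."""
    assignment = assignment or {}
    if i > n:
        observed_all = all(clause_satisfied(c, assignment) for c in clauses)
        # If some clause is unobserved, the chosen exam path is blocked with
        # probability 1/2 and the fallback edge costs 1.
        return Fraction(0) if observed_all else Fraction(1, 2)
    branches = [exam_cost(clauses, n, i + 1, {**assignment, i: value})
                for value in (True, False)]
    if i % 2 == 1:           # universal variable: nature decides which path is open
        return sum(branches) / 2
    return min(branches)     # existential variable: the traveler decides

# forall x1 exists x2: (x1 or x2) and (not x1 or not x2)   -> satisfiable
satisfiable = [[1, 2], [-1, -2]]
# forall x1 exists x2: (x1) and (x2)                       -> not satisfiable
unsatisfiable = [[1], [2]]
```

For the unsatisfiable formula with n = 2 the sketch returns 1/4, which meets the lower bound $2^{-n/2-1}$ used to choose $h$.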

3. Complexity of CTP 

Having shown that CTP-Dep is PSPACE-hard, we extend the proof to 
the "standard" stochastic independent undirected edges CTP: 

Theorem 2. CTP is PSPACE-complete.

In order to prove Theorem 2 we use the same general outline of the
reduction from QBF as in the proof of Theorem 1. However, in CTP-Dep,
dependencies and directed edges restrict the available choices, thereby
simplifying the proof. Here we introduce special gadgets that limit choice de
facto, and show that any deviation from these limitations is necessarily
suboptimal. Policies that obey these limitations are called reasonable policies.
Each such gadget $g$ has an entry terminal $Entry(g)$ and an exit terminal
$Exit(g)$; an attempt to traverse $g$ from $Entry(g)$ to $Exit(g)$ is henceforth
called crossing $g$. The gadgets operate by allowing a potential shortcut to
the target $t$; crossing these gadgets may either end up at $Exit(g)$, with some
probability $q(g)$, or at $t$ instead. The edges that allow direct access to $t$ are
called shortcut edges.

We introduce the gadgets in Sections 3.1 and 3.2, and the CTP-graph
construction in Section 3.3. The actual proof of Theorem 2 is in Section
3.4. In the description of the gadgets and CTP-graph, we sometimes add
zero-cost always traversable edges. These edges, which appear unlabeled in
Figures 2, 3 and 4, were added solely in order to simplify the physical layout
as a figure; any u, v connected by such an edge can be considered to be the
same vertex.

3.1. Baiting Gadgets 

A baiting gadget $g = BG(u, v)$ with parameter $L > 1$ is a three-terminal
weighted graph (see Fig. 2): an entry terminal $u = Entry(g)$, an exit
terminal $v = Exit(g)$, and a shortcut terminal which is always $t$. The latter
terminal is henceforth omitted in external reference to $g$, for conciseness.

The baiting gadget consists of $N + 1$ uniform sections of an undirected
path $(u, v_1, \ldots, v_N, v)$ with total weight $L$; each intermediate vertex has a
0-cost shortcut to $t$ with a blocking probability 1/2. In addition, there is a
shortcut edge with cost $L$ from the terminals $u$, $v$ to $t$. Set
$N = 2^{\lceil \log_2(4L) \rceil} - 1$.
We assume that $g$ is connected to the graph such that any policy executed
at $u$, in which the edge $(u, v_1)$ is not traversed, has an expected cost of
at least 1. Later on we see that this assumption holds in the CTP-graph
construction.

Let $\pi$ be the following partial policy: when at $u$ for the first time, proceed
along the path $(u, v_1, \ldots, v_N, v)$ to $v$, taking the 0-cost shortcut to $t$ whenever
possible, but never backtracking to $u$.

It is easy to show that even if we need to take the cost $L$ shortcut at $v$,
the expected cost of executing $\pi$ at $u$ for the first time is less than 1. Because
of the cost $L$ shortcut edge, the expected cost of any optimal policy
once at $v$ (knowing all 0-cost shortcuts are blocked) is no more than $L$; hence
under any reasonable policy, $g$ is not retraced. A similar argument holds for
retracing to $u$ from other locations along the path $(u, v_1, \ldots, v_N, v)$. Hence
we have:

Claim 2. When at $u$ for the first time, $\pi$ is optimal for a baiting gadget
$BG(u, v)$ with a parameter $L > 1$. After reaching $v$, it is suboptimal to
backtrack to $u$ in $g$.
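The bound on the expected cost of crossing a baiting gadget can be checked numerically. The Python sketch below is a hedged illustration: it assumes the number of intermediate vertices is $N = 2^{\lceil \log_2(4L) \rceil} - 1$ (our reading of the damaged source text), restricts $L$ to integers for exact arithmetic, and charges the worst case of taking the cost $L$ shortcut once $v$ is reached. The function names are hypothetical.

```python
from fractions import Fraction

def gadget_sections(L):
    # Assumed parameter choice: N + 1 is the smallest power of two >= 4L.
    return (1 << (4 * L - 1).bit_length()) - 1

def expected_crossing_cost(L):
    """Upper bound on the expected cost of the partial policy for BG(u, v)."""
    N = gadget_sections(L)
    step = Fraction(L, N + 1)      # weight of one uniform section
    cost = Fraction(0)
    reach = Fraction(1)            # probability that no shortcut was open so far
    for _ in range(N):             # sections ending at v_1, ..., v_N
        cost += reach * step
        reach /= 2                 # the 0-cost shortcut here is blocked w.p. 1/2
    # Last section to v, then (worst case) the cost-L shortcut to t.
    cost += reach * (step + L)
    return cost
```

For example, with $L = 2$ the gadget has $N = 7$ intermediate vertices and the bound evaluates to roughly 0.51, comfortably below 1.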

Note that $g$ is actually symmetric w.r.t. $u, v$. However, since by
construction of the CTP-graph, every reasonable policy always reaches one
designated terminal $u$ first, we treat $g$ externally as if it were directional. A
precise derivation of the parameters of baiting gadgets appears in Appendix A.1.

Figure 2: A Baiting Gadget $BG(u, v)$ with a parameter $L > 1$. Edge label w|p denotes
cost|blocking probability. The optimal policy at $u$ is to cross the path $(u, v_1, \ldots, v_N, v)$,
taking a shortcut edge to $t$ whenever such an edge is found unblocked. After reaching $v$,
retracing to $u$ in $g$ costs at least $L$.

3.2. Observation Gadgets 

An observation gadget $g = OG(u, v, o)$ is a four-terminal weighted graph
(see Fig. 3): an entry terminal $u = Entry(g)$, an exit terminal $v = Exit(g)$,
an observation terminal $o$, and a shortcut terminal (again omitted in external
references) which is always $t$. The observation gadget begins with a baiting
gadget $BG_1 = BG(u, v_1)$ with parameter $L > 8$, which is connected to
the "observation loop" beginning with a baiting gadget $BG_2 = BG(v_1, v_2)$
with parameter $3L/2$, a zero-cost edge $(v_2, v_3)$ with blocking probability
3/4, and a cost $L_1 = 5L/8$ unblocked edge to $o$. An always unblocked cost $3L/2$
shortcut edge $(v_2, t)$ is assumed. The observation loop is closed by a cost $L_1$
unblocked edge to $v_4$ and a zero-cost edge $(v_4, v_1)$ with blocking probability
3/4. From $v_1$, a cost 1 unblocked edge $(v_1, v'_1)$ followed by a baiting gadget
$BG_3 = BG(v'_1, v)$ with parameter $L$ completes the gadget.

We assume that $o$ is either not directly connected to the rest of the
graph, or connected through a path $(r_2, r_3, r_4, r_5, r'_1, r'_2)$ called the exam
section path ($o$ is identified with an endpoint of the observation edge
$(r_4, r_5)$) with the following properties: the
edges $(r_2, r_3)$, $(r'_1, r'_2)$ and $(r_4, r_5)$ have zero cost and blocking probability
$p_1$, where $p_1 > 1 - 2/(3L + 1)$. $(r_2, r_3)$ and $(r'_1, r'_2)$ are called guard edges,
and $(r_4, r_5)$ is called an observation edge. The edges $(r_3, r_4)$ and $(r_5, r'_1)$ are
always traversable edges with cost 1. The vertex $o$ is allowed to coincide
with observation terminals of other observation gadgets. The notations of
the exam section path are chosen to match the description of the CTP-graph
construction in Section 3.3.

Figure 3: An Observation Gadget $OG(u, v, o)$. Light gray arrows indicate general
traversal direction of the optimal policy $\pi$.

Let $\pi_g$ be the following partial policy at $g$: when at $u$, cross $BG_1$. Then
(observing $(v_1, v_4)$), cross $BG_2$. If either $(v_2, v_3)$ or $(v_1, v_4)$ is found blocked,
reach $t$ by traversing the cost $3L/2$ shortcut edge $(v_2, t)$. However, if both
$(v_2, v_3)$ and $(v_1, v_4)$ are unblocked, traverse $(v_2, v_3, o, v_4, v_1, v'_1)$ (observing
any edges incident on $o$ such as the observation edge $(r_4, r_5)$), and cross
$BG_3$.



Again, by construction of the CTP-graph (section 3.3), any policy at u 
other than crossing BGi results in a cost of at least 1. 

Claim 3. When at $u$ for the first time, $\pi_g$ is an optimal policy for $g$.

Proof Outline. Properties of the baiting gadgets ensure that $g$ is traversed
in the correct order. The guard edges $(r_2, r_3)$ and $(r'_1, r'_2)$ ensure that
it is suboptimal to "escape" from $o$ by traversing edges in the exam section.
The uncertain edges $(v_4, v_1)$ and $(v_2, v_3)$ ensure that it is suboptimal to enter
a previously uncrossed observation gadget from $o$. Likewise for a previously
crossed observation gadget $g'$: entering $g'$ through $o$ is suboptimal because
all the baiting gadgets in $g'$ have been crossed and observed to contain no
unblocked zero-cost shortcuts.

Detailed derivation of the properties of observation gadgets appears in
Appendix A.2.

3.3. CTP-graph Construction 




Figure 4: CTP-graph construction for
$\Phi = \forall x_1 \exists x_2 \cdots (x_1 \lor x_2) \land (\neg x_1 \lor x_2) \cdots$.
BG - a baiting gadget. OG - an observation gadget. Light gray arrows indicate the general
traversal direction of the optimal policy when $\Phi$ is satisfiable.

Having shown the properties of the baiting and observation gadgets, we
are ready to construct the CTP-graph. For a QBF $\Phi$ with $n$ variables and $m$
clauses, we construct $G_\Phi$ in the same general outline as the construction of
the CTP-Dep graph (see Section 2) with the following changes (see Fig. 4).

The exam section is a path of $5(m+1)+1$ vertices
$\{r^i_j \mid 0 < i \le m+1,\ 1 \le j \le 5\}$ as follows. For every $0 < i \le m+1$,
$(r^i_1, r^i_2)$, $(r^i_2, r^i_3)$ and $(r^i_4, r^i_5)$ have zero
cost and blocking probability $p_1$, apart from $(r^{m+1}_4, r^{m+1}_5)$ which has zero cost
and is always traversable. $(r^i_1, r^i_2)$ and $(r^i_2, r^i_3)$ are called guard edges, and
$(r^i_4, r^i_5)$ is called a clause edge. $(r^i_3, r^i_4)$ and $(r^i_5, r^{i+1}_1)$ are always traversable
cost 1 edges. In addition, the exam section holds an additional vertex $r_0$,
an always traversable cost 1 edge $(r_0, r^1_1)$, and an always traversable cost
$L$ edge $(r_0, t)$. In order to guarantee correct operation of the observation
gadgets, reasonable policies must not traverse exam edges too early
while crossing the variables section. This is done by visiting the initially
uncertain guard edges only later via a section called the guards section,
which consists of a sequence of baiting gadgets $BG(z_i, z_{i+1})$ with parameter
$L$ that visit $r^i_2$ for all $0 < i \le m+1$.

The variables section is constructed as for CTP-Dep, except that the
directed edges $(v'_i, v_{i+1})$ are replaced by baiting gadgets $BG(v'_i, v_{i+1})$ with
parameter $L$. For each universal variable $x_i$ the universal edges $(v_i, v_{i1})$ and
$(v_i, \bar{v}_{i1})$ are cost 1 edges with blocking probability 1/2. For each existential
variable $x_i$, the existential edges $(v_i, v_{i1})$ and $(v_i, \bar{v}_{i1})$ are always traversable
edges with cost 1. Inside each true-path, every $(v_{ij}, o_{ij})$, $(v_{ij}, v_{i(j+1)})$ pair
is replaced by an observation gadget $g = OG(v_{ij}, v'_{ij}, o_{ij})$. The edges
$(v'_{ij}, v_{i(j+1)})$ are always unblocked zero-cost edges added for clarity. The observation vertex
$o_{ij}$ is identified with the vertex incident on the appropriate clause edge in
the exam section. That is, if $x_i$ appears unnegated in clause $j$, then $o_{ij}$ of
the true-path is identified with the vertex of the exam section incident on the
clause edge of clause $j$. Likewise respectively
for all the edges in the false-paths.

For example, Fig. 4 demonstrates the reduction for
$\Phi = \forall x_1 \exists x_2 \cdots (x_1 \lor x_2) \land (\neg x_1 \lor x_2) \cdots$.
The variable $x_1$ appears negated in clause 2, so
the vertex $\bar{o}_{12}$ at the section $X_1$ and the vertex of the exam section
incident on the clause edge of clause 2 are
connected by an unlabeled edge; hence the clause edge $e_2 = (r^2_4, r^2_5)$ can be
observed from the observation gadget $OG(\bar{v}_{12}, \bar{v}'_{12}, \bar{o}_{12})$ when traversing the
false-path of $X_1$. The connections of the other observation gadgets can
be explained similarly.

3.4. Proof of Theorem 2

Given a QBF $\Phi$ with $n$ variables and $m$ clauses, we construct $G_\Phi$ as in
Section 3.3. Set $L = 8m + 16$, and set $p_1$ so that $1 - p_1$ is a power of 2 and
$p_1 > 1 - 2/(3L + 1)$. We show that it
is optimal to traverse $(s, v_1)$ if and only if $\Phi$ is satisfiable.

Unless stated otherwise, we henceforth consider only reasonable policies
for $G_\Phi$ that do not begin with the default action of traversing $(s, t)$. Due to
properties of the gadgets (Claims 2 and 3), any reasonable policy $\pi$ for $G_\Phi$ must
follow the restrictions in Table 1, as any other action is suboptimal.

Most of these restrictions are immediate consequences of executing optimal
policies at the baiting and observation gadgets (see Appendix A.1 and
Appendix A.2 for detail). The following claim, proved in Appendix B.1,
shows the actions of any reasonable policy for $G_\Phi$ at $r_0$.

Table 1: Reasonable policy actions in $\pi$

Location                              Action
$v'_i$, for $i < n$                   cross $BG(v'_i, v_{i+1})$
$v_i$, for $i \le n$                  go to $v_{i1}$ or $\bar{v}_{i1}$
$v_{il}$, for $i \le n$               cross $OG(v_{il}, v'_{il}, o_{il})$
$\bar{v}_{il}$, for $i \le n$         cross $OG(\bar{v}_{il}, \bar{v}'_{il}, \bar{o}_{il})$
$z_l$, for $0 < l < m + 2$            cross $BG(z_l, z_{l+1})$
$r_0$                                 pass exam or take shortcut
Claim 4. At $r_0$, any reasonable policy acts as follows:

• If all the edges in the exam section were observed to be unblocked,
cross $(r_0, \ldots, r^{m+1}_5, t)$ until reaching $t$ for a cost of $2(m + 1)$.

• Otherwise, cross the cost $L$ shortcut edge $(r_0, t)$.



Therefore, reasonable policies for $G_\Phi$ differ only in the choices made in
the universal and existential edges, and in the choice at $r_0$, which is either
to traverse the exam section if all clause edges were observed, or otherwise
take the expensive shortcut $(r_0, t)$.

Now let $\pi$ be a reasonable policy for $G_\Phi$ and denote the expected cost of
$\pi$ by $C(\pi)$. Define a weather to be an assignment of {traversable, blocked}
to the edges of $G_\Phi$. Let $W$ be the set of all possible weathers for $G_\Phi$, and
for $w \in W$ let $p_w$ be the probability that weather $w$ occurs. Define $C(\pi, w)$
to be the cost of executing $\pi$ over a weather $w$. Then

$$C(\pi) = \sum_{w \in W} p_w C(\pi, w) \qquad (1)$$

Next, partition $W$ into full-trip weathers $W^{ft}(\pi)$, in which $r_0$ is reached
while executing $\pi$, and shortcut weathers $W^{st}(\pi)$, in which $r_0$ is not reached
due to taking a shortcut edge to $t$ before reaching $r_0$. Then:

$$C(\pi) = \sum_{w \in W^{st}(\pi)} p_w C(\pi, w) + \sum_{w \in W^{ft}(\pi)} p_w C(\pi, w) \qquad (2)$$

Let $\pi^T$ be a policy for $G_\Phi$ such that in every subsection $X_i$ of the variables
section, whenever possible, the true-path is always chosen. Define:

$$D_{st} = \sum_{w \in W^{st}(\pi^T)} p_w C(\pi^T, w) \qquad (3)$$

As all the true-paths and false-paths of the variables section are symmetric
in the number of observation gadgets and other edges, there is a bijection
$g_\pi : W^{st}(\pi) \to W^{st}(\pi^T)$ such that $p_w = p_{g_\pi(w)}$ and
$C(\pi, w) = C(\pi^T, g_\pi(w))$ for every $w \in W^{st}(\pi)$. Hence we have:

$$D_{st} = \sum_{w \in W^{st}(\pi)} p_w C(\pi, w)$$

and

$$C(\pi) = D_{st} + \sum_{w \in W^{ft}(\pi)} p_w C(\pi, w) \qquad (4)$$
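The decomposition of the expected cost over weathers can be illustrated with a small Python sketch. Everything in it is a hypothetical toy: the two uncertain edges "a" and "b", their blocking probabilities, the cost function, and the predicate that stands in for "the exam entry vertex is reached" are illustrative assumptions and are not taken from the CTP-graph.

```python
from fractions import Fraction
from itertools import product

def weathers(block_prob):
    """Yield (weather, probability) pairs for independent edges; True = blocked."""
    edges = sorted(block_prob)
    for states in product((False, True), repeat=len(edges)):
        weather = dict(zip(edges, states))
        p_w = Fraction(1)
        for e in edges:
            p_w *= block_prob[e] if weather[e] else 1 - block_prob[e]
        yield weather, p_w

def partial_cost(block_prob, cost, keep=lambda w: True):
    """Sum of p_w * C(pi, w) over the weathers selected by `keep`."""
    return sum((p * cost(w) for w, p in weathers(block_prob) if keep(w)),
               Fraction(0))

# Toy instance: the fixed policy pays 1 if both edges are open, 10 otherwise.
block_prob = {"a": Fraction(1, 2), "b": Fraction(1, 4)}
cost = lambda w: 10 if (w["a"] or w["b"]) else 1
full_trip = lambda w: not w["a"]     # stand-in for "the exam entry is reached"

total = partial_cost(block_prob, cost)
full = partial_cost(block_prob, cost, full_trip)
short = partial_cost(block_prob, cost, lambda w: not full_trip(w))
```

Here `total` plays the role of the sum over all weathers, while `short` and `full` are the two partial sums over shortcut and full-trip weathers; by construction they add up to `total`.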

Again, due to symmetry, and the properties of the baiting and observation
gadgets (Claims 2 and 3), the total cost from $s$ to $r_0$ while executing $\pi$ in any
full-trip weather $w \in W^{ft}(\pi)$ is independent of $w$. We denote this cost by $D_{pt}$,
and can compute it simply by summing the cost of traversing from $s$ to $r_0$
through the variables section and guards section, assuming that $r_0$ is reached.
Then we get:

19mL + 4 

Dpt = l + {2 + )n + {n + m + l)L (5) 

Then from $r_0$ to $t$ the cost is either $2(m + 1)$ (if the exam section is known
to be completely unblocked), or $L > 2(m + 1)$ (taking the shortcut $(r_0, t)$,
if some edges in the exam section are known to be blocked, or some such
unknown edges remain). Hence for any full-trip weather $w$, $C(\pi, w)$ is either
$D_{pt} + L$ or $D_{pt} + 2(m + 1)$.

Let $P_\Phi \in [0, 1]$ be the probability that not all the clause edges of the
exam section were observed in a full-trip weather by following $\pi$ (this
probability depends on the formula $\Phi$). Then, with probability
$(1 - P_\Phi)(1 - p_1)^{3m+2}$, all the edges of the exam section were observed and were found
unblocked before reaching $r_0$. Let $P_{rt} = (1 - p_1)^{3m+2}$ be the probability
that all the edges in the exam section are unblocked, and denote by $P_{r_0}$ the
probability of reaching $r_0$ by executing $\pi$. Again, due to symmetry of the
baiting and observation gadgets, $P_{r_0}$ is independent of $\pi$. We get:

$$\sum_{w \in W^{ft}(\pi)} p_w C(\pi, w) = P_{r_0}\bigl(D_{pt} + P_\Phi L + (1 - P_\Phi)(P_{rt}\, 2(m + 1) + (1 - P_{rt}) L)\bigr) \qquad (6)$$

And therefore

$$C(\pi) = D_{st} + P_{r_0}\bigl(D_{pt} + P_\Phi L + (1 - P_\Phi)(P_{rt}\, 2(m + 1) + (1 - P_{rt}) L)\bigr) \qquad (7)$$

If $\Phi$ is satisfiable, then, as in the proof of Theorem 1, there is a reasonable
policy $\pi$ which follows the assignments that satisfy $\Phi$; thus every clause edge
is observed, and $P_\Phi = 0$. Define $B_0 = C(\pi)$ for such a policy $\pi$ when $\Phi$ is
satisfiable. Then

$$B_0 = D_{st} + P_{r_0}\bigl(D_{pt} + P_{rt}\, 2(m + 1) + (1 - P_{rt}) L\bigr) \qquad (8)$$

If $\Phi$ is not satisfiable, then at some universal subsection $X_i$ of the
variables section, there is a probability of at least 1/4 that a universal edge
must be traversed, such that upon reaching $r_0$, not all the clause edges are
visited. Hence, in total, there is a probability of at least $(1/4)^{n/2}$ that not all
the clause edges are visited. Note that as $P_{r_0}$ already excludes events where
both universal edges are blocked for some variable, if $\Phi$ is not satisfiable,
then for every reasonable policy $\pi$, $P_\Phi \ge (1/4)^{n/2}$. Hence define $B_1$ as follows.

$$B_1 = D_{st} + P_{r_0}\bigl(D_{pt} + (1/4)^{n/2} L + (1 - (1/4)^{n/2})(P_{rt}\, 2(m + 1) + (1 - P_{rt}) L)\bigr) \qquad (9)$$

Then $B_1 > B_0$, and if $\Phi$ is not satisfiable, then $C(\pi) \ge B_1$. Now set the
weight $h = w((s, t))$ of the default edge so that $B_1 > h > B_0$. Thus the optimal
action at $s$ is to traverse $(s, t)$ if and only if $\Phi$ is unsatisfiable. Since the
CTP-graph construction uses a polynomial number of vertices, and all the
weights and probabilities by construction need only a polynomial number
of bits (see Appendix B.2 for the technical computation of $h$), Theorem 2
follows. □

Several corollaries follow due to the construction of $G_\Phi$.

Corollary 1. It is PSPACE-hard to determine the expected cost of the
optimal policy in CTP.

By replacing all the edges with appropriately directed edges, we get:

Corollary 2. CTP with directed edges but no dependencies is
PSPACE-complete.

Finally, as every unknown edge in this construction of $G_\Phi$ has cost 0
and a probability which is a power of 2 of being unblocked (the universal
edges, for example, can be split into a two-edge path), we can replace every
unknown edge with a path of zero-cost, blocking probability 1/2 edges and
get:

Corollary 3. CTP remains PSPACE-complete even if all the unknown
edges have zero cost and blocking probability 1/2.

4. Complexity of CTP with remote sensing 

A somewhat more general version of CTP is Sensing-CTP. In this variant,
the state of graph edges can be remotely sensed at any time, paying
a known cost. Formally, Sensing-CTP is defined exactly the same way as
stochastic CTP, w.r.t. the graph, edge-blocking probabilities and weights,
and source and target vertices (see Section 1). In addition, a sensing cost
function $SC : V \times E \to \Re^{+}$ is given. An edge $e$, not necessarily incident
on a vertex $v$, can be sensed for a cost $SC(v, e)$ and as a result, the true
state of $e$ is revealed. The problem in Sensing-CTP is to find a policy which
minimizes the total expected sensing cost plus travel cost from $s$ to $t$.

CTP is solvable in polynomial time when the graph consists of edge- 
disjoint paths which meet only at s and t [9]. This gives rise to the question 
whether Sensing-CTP is also tractable for disjoint paths. We show that this 
is not the case for Sensing-CTP unless P=NP. Again, since the size of the 
policy may be exponential in the size of the graph, we actually show that 
it is NP-hard to determine even the first action in an optimal policy. 

Theorem 3. Sensing-CTP is NP-hard even in disjoint-path graphs. 

Proof. By reduction from the NP-complete problem vertex cover (VC)
[12]. Let $G = (V, E)$ be an undirected graph, for which we need to decide
if there is an $S \subseteq V$ of size at most $k$ such that every edge in $E$ is incident
on a vertex in $S$ ($S$ is called a vertex cover of size $k$). The idea of the proof
is, informally, for a policy to benefit if it "has sensed" all of the edges in a
given VC instance, where sensing of all neighbors of a vertex can be done at
some constant cost. By tweaking constants, these actions will be beneficial
if and only if it is possible to sense all edges in the VC instance by sensing
the neighbors of at most $k$ vertices.

Construct the corresponding CTP-PATH graph $G' = (V', E')$ as follows
(see Fig. 5). For each vertex $v \in V$, construct a "vertex" node $f(v) \in V'$,
an edge $(s, f(v))$ with a cost $C$ defined below, and an infinite-cost edge
$(f(v), t)$. Also construct the "default edge" $(s, t)$ with cost 4. The above
edges are always traversable.

Construct the "sensing path", a path consisting of a
"leader" always traversable edge $e_0$ with cost $L$ starting at $s$, and for
each edge $e \in E$, a zero-cost edge $f(e)$ in sequence, and finally an
infinite-cost edge $(u, t)$. The probability that each of the latter edges is blocked
is $\epsilon$, a small positive number defined below, except for $(u, t)$ which has
infinite cost and is therefore never traversable. Additionally, we have a two-edge
"uncertain" path $((s, x), (x, t))$ where $(s, x)$ is always traversable and
costs 2, and $(x, t)$ is traversable with probability 1/2 and costs 0. Note that
the resulting graph $G'$ consists only of edge-disjoint paths leading from $s$ to
$t$.

Figure 5: CTP-graph for reduction from vertex cover

The sensing cost from "vertex" node $f(v)$ on edge $f(e)$ is 0 for all $e$
incident on $v$ in $G$. Sensing $(x, t)$ costs 0 from $u$. All other sensing costs
are infinite. We show that for parameter values defined below, the optimal
policy is to immediately traverse $(s, t)$ if and only if $G$ does not have a
vertex cover of size $k$.

Assuming no sensing, there are only two "reasonable" traversal policies:
traversing $(s, t)$ immediately for a cost of 4, or trying the uncertain path,
which costs 2 if $(x, t)$ is traversable, and 8 if not, for an expected cost of
5. Thus the optimal policy with no sensing is to traverse $(s, t)$. However, if
sensing is allowed, the "value of information" of knowing whether $(x, t)$ is
traversable is $VOI(x, t) = 4 - (2 \times 1/2 + 4 \times 1/2) = 1$, due to the fact that
if $(x, t)$ is revealed as traversable (which happens with probability 1/2) we
gain 2 by taking the "uncertain" path instead of $(s, t)$, and otherwise gain
nothing.
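The arithmetic above can be replayed in a few lines of Python. This is only a sketch of the calculation in the text; the constant and function names are illustrative.

```python
from fractions import Fraction

DEFAULT_COST = 4            # cost of the default edge (s, t)
COST_SX = 2                 # cost of the always traversable edge (s, x)
P_OPEN = Fraction(1, 2)     # probability that (x, t) is traversable

def cost_without_sensing():
    # Trying the uncertain path: 2 if (x, t) is open; otherwise go back to s
    # (another 2) and take the default edge (4), for a total of 8.
    try_uncertain = P_OPEN * COST_SX + (1 - P_OPEN) * (2 * COST_SX + DEFAULT_COST)
    return min(DEFAULT_COST, try_uncertain)

def cost_with_free_information():
    # Knowing the state of (x, t) in advance: take the uncertain path only if open.
    return P_OPEN * COST_SX + (1 - P_OPEN) * DEFAULT_COST

voi = cost_without_sensing() - cost_with_free_information()
```

The value of information is the difference between the best expected cost without the observation (4) and the expected cost when the state of $(x, t)$ is known for free (3).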

We show below that for appropriate values of $L$, it is not beneficial to
try to get to $u$ in order to sense $(x, t)$ unless sensing reveals that the path
to $u$ is unblocked. The probability that at least one edge on the sensing
path is blocked is at least $\epsilon$. However, all the edges in the sensing path can
be sensed (stopping if any blocked edge is sensed) for a cost $C_{sense}$ defined
below. If $G$ has a vertex cover of size $k$, then $C_{sense} \le 2Ck$ by visiting $k$
"vertex" nodes. However, if the smallest vertex cover is of size $k' \ge k + 1$,
the expected sensing cost becomes:

$$C_{sense} = 2C \sum_{i=1}^{k'} \prod_{j=1}^{i-1} (1 - \epsilon)^{d_j} \ge 2Ck'(1 - \epsilon)^{|E|} \qquad (10)$$

where $d_j$ is the number of previously unsensed edges of the $j$th vertex in
the (unknown) optimal sensing order. For any $0 < \alpha < 1$, set $\epsilon$ such that

$$(1 - \epsilon)^{|E|} > \frac{k + 1 - \alpha}{k + 1}$$

Then, as $2Ck'(1 - \epsilon)^{|E|} > 2C(k + 1 - \alpha)$, we have that

$$C_{sense} > 2C(k + 1 - \alpha)$$

Now set $L = \frac{1}{2} - \frac{\epsilon}{4}$ and:

$$C = \frac{\epsilon (1 - \epsilon)^{|E|}}{2(2k + 1 - \alpha)}$$

To complete the proof, it is sufficient to prove the following claim:

Claim 5. The optimal action at $s$ is to traverse $(s, t)$ if and only if $G$ does
not have a vertex cover of size $k$.

Proof. We show the following:

• If there is no cover of size $k$, the optimal policy is to traverse $(s, t)$.

• Otherwise, the optimal policy is as follows: visit the "vertex" nodes
constituting the cover, doing the appropriate sensing actions; if the
sensing path is unblocked, visit $u$ in order to sense $(x, t)$, and then
take the path from $s$ to $t$ through $x$ if $(x, t)$ is found traversable. (In
this case, given the optimal policy, one can straightforwardly construct
the vertex cover of size $k$.)

Note that it is suboptimal to try the sensing path unless assured that all
edges leading to $u$ are traversable, because otherwise there is at least one edge
that can be blocked with probability $\epsilon$, in which case attempting this path
results in no positive gain. To see this, note that the traversal cost $2L$ must
be paid anyway, thus the total expected gain from trying the sensing path
is:

$$g \le (1 - \epsilon) VOI(x, t) - 2L = 1 - \epsilon - 2L = 1 - \epsilon - 2\left(\frac{1}{2} - \frac{\epsilon}{4}\right) = -\frac{\epsilon}{2} < 0$$

This also holds for all policies that attempt some sensing actions before
trying the sensing path, but that do not make sure that $u$ is reachable before
trying the sensing path.

Now, if there is a vertex cover of size $k$, the expected cost of sensing
all edges in the sensing path is at most $2Ck$. If $u$ is found to be reachable
(prior probability $(1 - \epsilon)^{|E|}$ of that happening), use the sensing
path (which costs $2L$) to sense $(x, t)$, gaining the expected $VOI(x,t)$ of 1.
The total expected gain in this case is positive:

$$g' \ge (1-\epsilon)^{|E|}\left(VOI(x,t) - 2L\right) - 2Ck = (1-\epsilon)^{|E|}\left(1 - \left(1 - \frac{\epsilon}{2}\right)\right) - 2kC = \frac{\epsilon (1-\epsilon)^{|E|} (1 - \alpha)}{2(2k + 1 - \alpha)} > 0$$

Finally, if there is no vertex cover of size $k$, then the expected gain from
sensing in the best case (assuming the policy of sensing all edges in the
sensing path from only $k + 1$ "vertex" nodes, costing at least
$2C(k + 1 - \alpha)$ in expectation) is at most:

$$g'' \le (1-\epsilon)^{|E|}\left(VOI(x,t) - 2L\right) - 2C(k + 1 - \alpha) = (1-\epsilon)^{|E|}\left(1 - \left(1 - \frac{\epsilon}{2}\right)\right) - 2(k + 1 - \alpha)C = -\frac{\epsilon (1-\epsilon)^{|E|} (1 - \alpha)}{2(2k + 1 - \alpha)} < 0$$

and thus the optimal policy here is not to perform sensing at all, but to
traverse (s, t) immediately. □
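
The sign pattern used in the proof of Claim 5 can be spot-checked numerically. The sketch below assumes $L = 1/2 - \epsilon/4$ and $C = \epsilon(1-\epsilon)^{|E|} / (2(2k+1-\alpha))$; both values, as well as all function and variable names, are assumptions of this example rather than a definitive statement of the construction.

```python
from fractions import Fraction


def expected_gains(k, num_edges, eps, alpha):
    """Return (blind, with_cover, without_cover) expected-gain bounds.

    ASSUMED constants: L = 1/2 - eps/4 and
    C = eps * (1 - eps)^|E| / (2 * (2k + 1 - alpha)), with VOI(x, t) = 1.
    """
    L = Fraction(1, 2) - eps / 4
    reach = (1 - eps) ** num_edges  # probability the whole sensing path is open
    C = eps * reach / (2 * (2 * k + 1 - alpha))
    # Trying the sensing path without knowing that u is reachable.
    blind = (1 - eps) - 2 * L
    # Sensing first, when a vertex cover of size k exists (sensing cost <= 2Ck).
    with_cover = reach * (1 - 2 * L) - 2 * C * k
    # Sensing first, when no such cover exists (cost >= 2C(k + 1 - alpha)).
    without_cover = reach * (1 - 2 * L) - 2 * C * (k + 1 - alpha)
    return blind, with_cover, without_cover


if __name__ == "__main__":
    print(expected_gains(k=3, num_edges=5, eps=Fraction(1, 100), alpha=Fraction(1, 2)))
```

Under these assumptions the first and third values are negative and the second is positive, matching the case analysis in the proof.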



5. Discussion 



Having shown that stochastic CTP is PSPACE-hard, several related 
questions on variants of CTP and CTP with restricted topologies arise. One 
issue of particular interest is the question of efficiently finding approximately 
optimal actions. The proofs in this paper make use of rather small gaps 
between expected values of two candidate actions, and thus leave open the 
possibility of efficient approximation algorithms. 

Studies of the competitive analysis of the Canadian Traveler Problem
reveal rudimentary bounds on approximability. Denoting by $k$ the number
of uncertain edges in an instance, there exist for the undirected case
polynomial-time algorithms achieving competitive ratios of $2k + 1$ [6]. As a
consequence, stochastic CTP can be approximated within $2k + 1$. With a
slightly improved analysis, the same algorithm yields a $(2n+1)$-approximation.
In the directed case, existing results from competitive analysis only yield
approximations of $2^{k+1} + 1$ and $2^{n+1} + 1$, respectively [5].

These approximation algorithms forego entirely the stochastic nature of 
the problem and leave open considerable improvements. At the time of this 
writing, no notable hardness of approximation results are known. 

Another issue is: what is the most general graph topology under which
CTP is tractable or easy to approximate? An efficient algorithm was shown
[9] for disjoint-path graphs, based on a lemma that there exists an optimal
policy that is committing (such a policy never returns to the source vertex
unless its "current" path to the target vertex is known to be blocked). But
any departure from the disjoint-path structure (such as adding even one
more edge that crosses between vertices in two of the paths) complicates
things considerably by voiding the optimality of committing policies.

The st-reliability problem (finding the probability that an unblocked
path from s to t exists) appears to be an essential building block in solving
CTP. The reduction in [1] shows that CTP is intractable for almost any
graph topology for which st-reliability is intractable as well. An open
research question is whether CTP is tractable or easy to approximate for graph
topologies in which st-reliability is tractable, such as "tree-structured"
graphs, or the more general series-parallel graphs.

Sensing-CTP is generally harder than CTP for restricted topologies,
as shown in Section 4. Other variants of CTP, such as CTP-Dep, are also
harder than CTP, as dependencies can act like remote sensing. It was shown
that CTP-Dep is NP-hard for disjoint-path graphs [13]. However, when
considering topological restrictions, one must also consider the topology of
the dependency graph, and the hardness proof in [13] used an essentially
unrestricted topology Bayes network to represent the dependencies.

Appendix A.

Appendix A.1. Baiting Gadgets

Let $g = BG(u, v)$ be a baiting gadget with a parameter $L > 1$, defined
in Section 3.1 (see Fig. 2). Recall that $\pi$ (as defined in Section 3.1) is the
following policy for $g$: At $u$, traverse $(u, v_1)$. At $v_i$, for any $i < N$, do as
follows: if $(v_i, t)$ is unblocked, reach the destination through $(v_i, t)$ for a
cost of zero. However, if $(v_i, t)$ is blocked, traverse $(v_i, v_{i+1})$ for a cost
of $L/(N + 1)$. At $v_N$, if $(v_N, t)$ is blocked, traverse $(v_N, v)$.

By construction of the CTP-graph, we assume that any policy other
than traversing $(u, v)$ results in a cost of at least 1 (see Section 3.1).



Apart from $\pi$, other policies at $u$ that are not clearly suboptimal are:

• Choose not to traverse $(u, v_1)$.

• The following type of policies, denoted by $\pi_j$ for $j \le N$: execute $\pi$
until reaching $v_j$; if $(v_j, t)$ is unblocked, reach the destination through
$(v_j, t)$; otherwise, retreat to $u$ and execute an optimal policy with an
expected cost of $M_j \ge 1$.

Finally, we set $N = 2^{\lceil \log_2(4L) \rceil} - 1$, implying $N + 1 \ge 4L$.

Claim 2. When at $u$ for the first time, $\pi$ is optimal for a baiting gadget
$BG(u, v)$ with a parameter $L > 1$. After reaching $v$, it is suboptimal to
backtrack to $u$.



Proof. Denote by $K$ the expected cost of the optimal policy executed once
$v$ is reached. As there is an $L$ cost shortcut edge $(v, t)$, it is clear that
$K \le L$; therefore it is always suboptimal to retrace $g$ once $v$ is reached.
We first show that $C(\pi) < 1$, hence choosing not to traverse $(u, v_1)$ is
suboptimal.

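Note that for every $i \le N$, the probability that $(v_i, v_{i+1})$ is traversed
in $\pi$ is $(\frac{1}{2})^i$. Hence we have

$$C(\pi) = \frac{L}{N+1} \sum_{i=0}^{N} \left(\frac{1}{2}\right)^i + \left(\frac{1}{2}\right)^N K \qquad (A.1)$$

Thus

$$C(\pi) = \frac{2L}{N+1}\left(1 - 2^{-(N+1)}\right) + 2^{-N} K \qquad (A.2)$$

Then, as $K \le L$, $N + 1 \ge 4L$ and $L > 1$, we have that

$$C(\pi) < \frac{2L}{4L} + \frac{1}{4} < 1 \qquad (A.3)$$

As required.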
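
The step from the sum to its closed form can be checked mechanically. The following sketch, with hypothetical function names, computes $N$ as the smallest value with $N + 1$ a power of two and $N + 1 \ge 4L$, evaluates $C(\pi)$ both as a sum over the $N + 1$ gadget edges and in closed form, and compares the two using exact rational arithmetic. It assumes, as in the text, that each shortcut edge is blocked with probability $1/2$.

```python
from fractions import Fraction


def gadget_length(L):
    """Return N such that N + 1 is the smallest power of two with N + 1 >= 4L."""
    n_plus_1 = 1
    while n_plus_1 < 4 * L:
        n_plus_1 *= 2
    return n_plus_1 - 1


def cost_by_sum(L, N, K):
    """C(pi) as a sum: edge i costs L/(N+1) and is traversed with prob. 2^-i."""
    step = Fraction(L) / (N + 1)
    walk = sum(step * Fraction(1, 2 ** i) for i in range(N + 1))
    return walk + Fraction(K) / 2 ** N


def cost_closed_form(L, N, K):
    """C(pi) after summing the geometric series."""
    return Fraction(2 * L, N + 1) * (1 - Fraction(1, 2 ** (N + 1))) + Fraction(K) / 2 ** N


if __name__ == "__main__":
    for L in (2, 3, 9):
        N = gadget_length(L)
        print(L, N, float(cost_closed_form(L, N, L)))
```

Taking $K = L$ gives the largest value of $C(\pi)$ allowed by $K \le L$, so a result below 1 for that choice covers every admissible $K$.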



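Next we show that for every $j \le N$, $C(\pi) < C(\pi_j)$, hence for every
$j \le N$, the policy $\pi_j$ is suboptimal. We have that

$$C(\pi_j) = \frac{L}{N+1} \sum_{i=0}^{j-1} \left(\frac{1}{2}\right)^i + 2^{-j}\left(\frac{jL}{N+1} + M_j\right) \qquad (A.4)$$

Thus

$$C(\pi_j) = \frac{2L}{N+1}\left(1 - 2^{-j}\right) + 2^{-j}\frac{jL}{N+1} + 2^{-j} M_j \qquad (A.5)$$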

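As $K \le L$ and $1 \le M_j$, it is sufficient from (A.2) to show that for every
$0 < j \le N$

$$\frac{2L}{N+1}\left(1 - 2^{-(N+1)}\right) + 2^{-N} L < \frac{2L}{N+1}\left(1 - 2^{-j}\right) + 2^{-j}\frac{jL}{N+1} + 2^{-j} \qquad (A.6)$$

Rearranging, this is equivalent to

$$\frac{2L}{N+1}\left(2^{-j} - 2^{-(N+1)} - 2^{-j-1} j\right) + 2^{-N} L < 2^{-j}$$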

Hence we need to show that for every $0 < j \le N$,

$$\frac{2L}{N+1}\left(1 - 2^{j-N-1} - \frac{j}{2}\right) + 2^{j-N} L < 1$$

As $N + 1 \ge 4L$ and $L > 1$, it is sufficient to show that for every
$0 < j \le N$,

$$\frac{1}{4}(2 - j) + 2^{j-N} L < 1 \qquad (A.7)$$

And inequality (A.7) follows since the function

$$f(x) = \frac{1}{4}(2 - x) + 2^{x-N} L$$

over the reals has only one extremum, $f(0) < 1$, $f(N) < 1$ and

$$\lim_{x \to \infty} f(x) = \lim_{x \to -\infty} f(x) = \infty$$
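
The comparison between $\pi$ and the retreating policies $\pi_j$ can also be spot-checked for small parameters. The sketch below uses hypothetical function names and takes the least favourable values allowed by the assumptions, $K = L$ for $\pi$ and $M_j = 1$ for $\pi_j$. It is an illustration for a few sample values of $L$, not a replacement for the argument above.

```python
from fractions import Fraction


def cost_pi(L, N, K):
    """C(pi): cross the whole gadget, then pay K after reaching v."""
    step = Fraction(L) / (N + 1)
    walk = sum(step * Fraction(1, 2 ** i) for i in range(N + 1))
    return walk + Fraction(K) / 2 ** N


def cost_pi_j(L, N, j, M_j):
    """C(pi_j): follow pi up to v_j; if (v_j, t) is blocked, walk back to u."""
    step = Fraction(L) / (N + 1)
    forward = sum(step * Fraction(1, 2 ** i) for i in range(j))
    # With probability 2^-j all shortcuts up to v_j are blocked: retreat over
    # j edges and continue with a policy of expected cost M_j.
    return forward + Fraction(1, 2 ** j) * (j * step + M_j)


def pi_beats_all_retreats(L, N):
    """True if C(pi) < C(pi_j) for every 1 <= j <= N, with K = L and M_j = 1."""
    best = cost_pi(L, N, L)
    return all(best < cost_pi_j(L, N, j, 1) for j in range(1, N + 1))


if __name__ == "__main__":
    for L, N in ((2, 7), (3, 15), (9, 63)):
        print(L, N, pi_beats_all_retreats(L, N))
```

The pairs $(L, N)$ in the usage example satisfy $N + 1 \ge 4L$ with $N + 1$ a power of two.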

Appendix A.2. Observation Gadgets



Let $g = OG(u, v, o)$ be an observation gadget as defined in Section 3.2
and seen in Fig. 3. Recall that $L > 8$, and $L_1 = 5L/8$. $\pi_g$ is the following
partial policy for $OG(u, v, o)$: At $u$, cross $BG_1$ (observe $(v_1, v_4)$). Then
cross $BG_2$. If either $(v_1, v_4)$ or $(v_2, v_3)$ is found blocked, reach $t$ by
traversing the $3L/2$ cost shortcut edge $(v_2, t)$. However, if both $(v_1, v_4)$
and $(v_2, v_3)$ are unblocked, traverse $(v_2, v_3, o, v_4, v_1, v_1')$ (at $o$,
observe the edges incident on $o$, in case there are any), and cross $BG_3$.

We again assume, by construction of the CTP-graph, that any policy at
$u$ other than crossing $BG_1$ results in a cost of at least 1.



Claim 3. When at $u$ for the first time, $\pi_g$ is an optimal policy for $g$.

Proof. At $u$, as $BG_1$ is a baiting gadget, then by Claim 2 it is optimal
to cross $BG_1$. When first arriving at $v_1$, after $BG_1$ is crossed, $(v_1, v_4)$
is observed. As $(o, v_4)$ has a cost of $L_1 > 1$, and $(v_1, v_1')$ has a cost of
1, then by Claim 2 it is optimal to cross $BG_2$. Once at $v_2$, if $(v_2, v_3)$ is
blocked, it is optimal to take the shortcut $(v_2, t)$ for a cost of $3L/2$.

It remains to show that if $(v_2, v_3)$ is unblocked, the optimal policy at
$v_2$ is:

1. if $(v_1, v_4)$ is unblocked, traverse $(v_2, v_3, o, v_4, v_1, v_1')$, and cross $BG_3$.

2. otherwise, traverse the shortcut $(v_2, t)$ for a cost of $3L/2$.

case 1: $(v_1, v_4)$ is unblocked.

First note that arriving at $v_1$ a second time through $(v_4, v_1)$, $BG_1$ and
$BG_2$ are known not to have any unblocked shortcut edges, hence by Claim 2
the optimal policy when arriving at $v_1$ a second time is to traverse
$(v_1, v_1')$ and the baiting gadget $BG_3$.

Now, traversing $(v_2, v_3, o, v_4, v_1, v_1')$ and crossing $BG_3$ bears an
expected cost of at most $2L_1 + 2$, while traversing $(v_2, t)$ costs $3L/2$.
Hence, as $2L_1 + 2 < 3L/2$, it is optimal at $v_2$ to traverse $(v_2, v_3, o)$.

We now inspect the possible partial policies at $o$:

case 1.a: Traverse $(o, v_4, v_1, v_1')$, and cross $BG_3$ for an expected cost
of at most $L_1 + 2$. We denote this partial policy by $\pi'$.

case 1.b: Traverse edges of another observation gadget $\hat{g}$, in case there
exists such $\hat{g}$ incident on $o$. Suppose that $\hat{g}$ has not already been
traversed (label the vertices of $\hat{g}$ as $\hat{v}_i$). Then traversing either
$(o, \hat{v}_3)$ and trying to traverse $(\hat{v}_3, \hat{v}_2)$, or traversing
$(o, \hat{v}_4)$ and trying to traverse $(\hat{v}_4, \hat{v}_1)$ results in an
expected cost of at least $L_1 + 3L_1/4$. Hence, as $L_1 + 2 < L_1 + 3L_1/4$, we
have that executing $\pi'$ is cheaper than traversing any edges of $\hat{g}$.

Next suppose that $\hat{g}$ has already been traversed. Therefore we may
assume that the policy $\pi_{\hat{g}}$ was executed, thus the baiting gadgets
of $\hat{g}$ are known not to contain any unblocked zero-cost shortcuts, hence
crossing each such baiting gadget costs $L$. Then traversing $\hat{g}$ results
in an expected cost of at least $L_1 + L$, and as $L_1 + 2 < L_1 + L$, we
again have that executing $\pi'$ is cheaper than traversing any edges of
$\hat{g}$.

case 1.c: Traverse the exam section path. Recall that $o$ is identified with
$r_5$. Suppose that observation edge $(r_4, r_5)$ is blocked. At $o$, denote the
following partial policy by $\pi_1$: cross $(r_5, r_1')$; if $(r_1', r_2')$ is
unblocked, continue with any optimal policy, otherwise, return to $o$, and
execute $\pi'$, which can still be executed, for an expected cost of
$C(\pi') \le L_1 + 2$. Then we have

$$C(\pi_1) \ge 1 + p_1(1 + C(\pi'))$$

and as $p_1 > 1 - 2/(3L + 1)$, we have that

$$C(\pi') < 1 + p_1(1 + C(\pi')) \le C(\pi_1)$$

Therefore executing $\pi'$ is cheaper than executing $\pi_1$.

Now suppose that $(r_4, r_5)$ is unblocked. Then we can either execute $\pi_1$
(or the symmetric case in which $(r_2, r_3)$ is being inspected) with the
same analysis, or we can extend $\pi_1$ with the following policy denoted
by $\pi_2$:

execute $\pi_1$; upon returning to $o$ (after $(r_1', r_2')$ is found blocked),
cross $(r_5, r_4)$ and $(r_4, r_3)$; if $(r_3, r_2)$ is unblocked, continue with
any optimal policy; otherwise return to $r_5$, and execute $\pi'$, which can
still be executed for an expected cost of $C(\pi') \le L_1 + 2$. Then we have
that

$$C(\pi_2) \ge 1 + 2p_1 + p_1^2(1 + C(\pi'))$$

However, $p_1 > 1 - 2/(3L + 1)$ entails $C(\pi') < C(\pi_2)$. Therefore
executing $\pi'$ is cheaper than executing $\pi_2$ as well. The policy in which
$(r_2, r_3)$ is the first edge among $(r_2, r_3)$ and $(r_1', r_2')$ to be
inspected is symmetric to $\pi_2$. Hence we see that traversing any edges of
the exam section path is suboptimal.

case 2: $(v_1, v_4)$ is blocked.

In this case the following partial policies can be executed at $v_2$:

case 2.a: Take the shortcut edge $(v_2, t)$ for a cost of $3L/2$. Denote this
policy by $\pi'$.

case 2.b: Traverse $(v_2, v_3)$ and $(v_3, o)$ for a cost of $L_1$ and at $o$
traverse edges of another observation gadget $\hat{g}$. As in case 1.b we have
that traversing edges of $\hat{g}$ results in an expected cost of at least
$L_1 + 3L_1/4$. Then, as $3L/2 < L_1 + L_1 + 3L_1/4$, we have that $\pi'$ is
cheaper than reaching $o$ and traversing any edges of $\hat{g}$.

case 2.c: Traverse $(v_2, v_3)$ and $(v_3, o)$, for a cost of $L_1$, and at $o$
traverse the exam path. We define $\pi_1$ and $\pi_2$ as in case 1.c. Recall
that $C(\pi') = 3L/2$. However, as $p_1 > 1 - 2/(3L + 1)$, we have that
$C(\pi') < L_1 + C(\pi_1)$, and $C(\pi') < L_1 + C(\pi_2)$. Hence $\pi'$ is cheaper
than traversing to $o$ and traversing any edges of the exam section path.

□ 
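
The cost comparisons in the proof of Claim 3 depend only on $L$ and $L_1 = 5L/8$, so they can be tabulated directly. The sketch below, whose function and key names are hypothetical, returns the slack in each inequality. A positive value means the corresponding inequality from the proof holds for the given $L$.

```python
from fractions import Fraction


def observation_gadget_margins(L):
    """Slack in the inequalities used for Claim 3, with L_1 = 5L/8."""
    L1 = Fraction(5, 8) * L
    return {
        # case 1: 2*L_1 + 2 < 3L/2 (detour through o beats the shortcut)
        "detour_vs_shortcut": Fraction(3, 2) * L - (2 * L1 + 2),
        # case 1.b, other gadget not yet traversed: L_1 + 2 < L_1 + 3*L_1/4
        "untraversed_gadget": (L1 + Fraction(3, 4) * L1) - (L1 + 2),
        # case 1.b, other gadget already traversed: L_1 + 2 < L_1 + L
        "traversed_gadget": (L1 + L) - (L1 + 2),
        # case 2.b: 3L/2 < L_1 + L_1 + 3*L_1/4
        "case_2b": (L1 + L1 + Fraction(3, 4) * L1) - Fraction(3, 2) * L,
    }


if __name__ == "__main__":
    for name, slack in observation_gadget_margins(9).items():
        print(name, slack)
```

At $L = 8$ the first margin is exactly zero, which is consistent with the requirement $L > 8$.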

Appendix B.

Appendix B.1. Behavior of reasonable policies

Claim 4. At $r_0$, any reasonable policy acts as follows. If all the edges in
the exam section were observed to be unblocked, cross
$(r_0, r_1^1, \ldots, t)$ until reaching $t$ for a cost of $2(m + 1)$. Otherwise,
cross the cost $L$ shortcut edge $(r_0, t)$.

Proof. Retracing $BG(z_0, z_1)$ clearly results in a cost of at least $L$. We
first see that unless all the edges in the exam section were observed to be
unblocked, any partial policy executed at $r_0$ results in a cost of at least
$L$, therefore it is cheaper to take the shortcut $(r_0, t)$ for a cost of $L$.

At every vertex r', / < m + 1, , i < 5, any unblocked edge on the exam 
section path, incident on r', can be traversed. At r2 there is an additional 
option to cross either BG{zi, zi+i), or BG{zi_i, zi) which hold no unblocked 
shortcut edges, hence crossing these results in a cost of at least L. If 
is identified with an observation point of some observation gadget g' , there 
is an additional option to traverse edges of g. However by an argument 
identical to case 1.b of Appendix A.2, traversing any edges of g results in a
cost of at least L. Hence any deviation from the exam section path results
in a cost of at least L.

Suppose that all the edges of the exam section are known to be unblocked.
Then, as the exam section contains $2(m + 1)$ always traversable
cost 1 edges, and as $2(m + 1) < L$, the optimal policy is to cross the exam
section $(r_0, r_1^1, \ldots, t)$ for the cost of $2(m + 1)$.

Otherwise, suppose there are edges in the exam section with unknown
status. As all the guard edges were observed upon crossing the guard
section, such an edge is a clause edge. Hence let $e_l = (r_4^l, r_5^l)$ be the
first unknown clause edge such that every edge in the path
$(r_0, \ldots, r_4^l)$ is known to be unblocked. Finding $e_l$ blocked results in
either retracing the exam section to $r_0$ and taking the cost $L$ shortcut to
$t$, or in deviating from the exam section. Hence, as each always traversable
edge of the exam section costs 1, $L > 1$ and $p_1 > 1 - 2/(3L + 1)$,
traversing from $r_0$ to $e_l$ results in an expected cost of at least
$1 + p_1(1 + L) > L$. Hence traversing the shortcut edge $(r_0, t)$ is
cheaper. Obviously, the same argument holds for traversing to $e_l$ where
$e_l$ is previously known to be blocked. □
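
The final inequality of the proof, $1 + p_1(1 + L) > L$, can be verified at the boundary value $p_1 = 1 - 2/(3L + 1)$. Since the left-hand side increases with $p_1$, every larger $p_1$ satisfies it as well. The sketch below uses a hypothetical function name and takes $L = 8m + 16$, the value recalled in Appendix B.2.

```python
from fractions import Fraction


def exam_attempt_margin(L, p1):
    """Slack of 1 + p1 * (1 + L) over L, the cost of the shortcut (r_0, t)."""
    return 1 + p1 * (1 + L) - L


if __name__ == "__main__":
    for m in (1, 2, 10):
        L = 8 * m + 16
        p1_boundary = 1 - Fraction(2, 3 * L + 1)
        # At the boundary the slack simplifies to 4L / (3L + 1), which is positive.
        print(m, L, exam_attempt_margin(L, p1_boundary))
```

At the boundary the slack equals $4L/(3L + 1)$, so it stays above 1 for every $L > 1$.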

Appendix B.2. Polynomial size representation 

We show the computation of h, the cost of the default edge (s, t). Recall 
that L = 8m + 16, A = 2ri°g2(4L)i _ i and pi = 1 - 2" V°fi2{^)\ _ 
From Section 3.4 we have that



and 



1 n 

h = Bo + {-)^mPr, 



$$B_0 = D_{st} + P_{r_0}\left(D_{ft} + P_{rt}\, 2(m + 1) + (1 - P_{rt}) L\right)$$

where for a reasonable policy $\pi$, $P_{r_0}$ is the probability of reaching
$r_0$ by executing $\pi$, $D_{ft}$ is the total cost from $s$ to $r_0$ while
executing $\pi$ in a full-trip weather, $D_{st}$ is the expected cost of
executing $\pi$ over shortcut weathers, and $P_{rt}$ is the probability that all
the edges in the exam section are unblocked.



From section 3.4, = (1 — pi)'^"^^'' and 



19mL + 4 

Dpt = l + {2 + )n + (n + m + 1)L 

Hence it is left to compute $P_{r_0}$ and $D_{st}$. To do that we define the
following.

For $k > 0$, let $G(k)$ be a CTP instance composed of a series of gadgets
$g_i$, $1 \le i \le k$, such that for every $i < k$, $Exit(g_i)$ is identified
with $Entry(g_{i+1})$. The gadgets in $G(k)$ are either all baiting gadgets with
a parameter $L > 1$ (then $G(k)$ is denoted by $BG(L, k)$), or are all
observation gadgets (then $G(k)$ is denoted by $OG(k)$). Set $s$ to be
$Entry(g_1)$. Then we denote the following policy for $G(k)$ as $\pi_k$: For
every $i \le k$, cross $g_i$.

Let $q(G(k))$ be the probability that $Exit(g_k)$ is reached by executing
$\pi_k$. Let $w_1(G(k))$ be the expected traversal cost when $Exit(g_k)$ is
reached while executing $\pi_k$. Let $w_2(G(k))$ be the expected cost in case a
shortcut to $t$ is taken while executing $\pi_k$. Then we have for $k > 1$

$$w_2(G(k)) = w_2(G(1)) + q(G(1))\left(w_1(G(1)) + w_2(G(k - 1))\right) \qquad (B.1)$$
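
Recursion (B.1) is straightforward to evaluate iteratively, which is what makes a polynomial-size computation possible. The following sketch takes the single-gadget quantities $q(G(1))$, $w_1(G(1))$ and $w_2(G(1))$ as inputs. The function name, parameter names and sample values are hypothetical and serve only to illustrate the recursion.

```python
from fractions import Fraction


def shortcut_cost(k, q1, w1, w2_single):
    """Evaluate w2(G(k)) from (B.1), given the quantities of a single gadget.

    q1        -- q(G(1)), probability of reaching the exit of one gadget
    w1        -- w1(G(1)), traversal cost when the exit is reached
    w2_single -- w2(G(1)), expected cost when a shortcut is taken
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    w2 = w2_single
    for _ in range(k - 1):
        # One unrolling step of (B.1): prepend a gadget to the series.
        w2 = w2_single + q1 * (w1 + w2)
    return w2


if __name__ == "__main__":
    print(shortcut_cost(3, Fraction(1, 2), 4, 1))
```

Using exact rationals avoids rounding when $q(G(1))$ is a very small power of two, as it is for baiting gadgets.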



Next, if $G(k)$ is a series of baiting gadgets we have that
$q(BG(L, k)) = 2^{-kN}$ and $w_1(BG(L, k)) = kL$. From (A.2) we obtain:



W2iBG{L,l)) 



2L 



N + 1 

If G{k) is a series of observation gadgets, set 



(B.2) 
Then we have qiOG{k)) = 2-'=(2^+^i+4)^ and Wi{OG{k)) = Ml?|±l). We
compute W2{0G{1)) to be: 



^2(0^(1)) = W2{BG{L, 1)) + 2-^{wi{BG{L, 1)) 

+2-^(^2(SG(^, l))+2-^-^^(^)+2-^-^-^(2Li+l+^2(SG(L, 1))))) 

(B.3) 

We can now compute $P_{r_0}$. Assume w.l.o.g. $n$ is even. Due to symmetry
of the true/false-paths, and the subsections of the variable sections, we have
that

Pr, = (^q{BG{L, l))q{OG{L, m))^q{BG{L, l))g(OG(L, m))^ ' q{BG{L, m+2)) 

(B.4) 

To find Dst we note the following. Assume w.l.o.g n is even. Due to 
symmetric considerations, there are parameters qgt, Wgt and Zst, independent 
of i, such that by executing tt at f^, f reached with probability g^t and 
an expected cost of Wst-, and v[j^2 reached (i.e. a shortcut it taken) 

with an expected cost of Zgt- Assume w.l.o.g. the subsection Xi is universal. 
Then 

qst = q{BG{L, l))^g(OG(L, m))g(i?G(L, l))g(OG(L, m)) (B.5) 

and 

Wst = 2L + 4: + 2mwi{OG{L,l)) (B.6) 

We now compute 

Zst = W2{BG{L, 1)) + 2-^(^/;i(SG'(L, 1)) + 11 + ^(1 + W2{,0G{L, m))+ 

q{OGm m)){wi{OG{L, m)) + 1 + W2{BG{L, 1))+ 

2-^(«;i(SG(L, 1)) + 1 + W2{OG{L, m)) + ^(©^(L, m)) + 1)))) (B.7) 
Now define Zst — Zst + Qst^st, set D]^ = Zst, and for every k > 1 set

D'st = Dl, + QstD'^i' (B 

Then

D,t = + {qst)^W2{BG{L,m + 1)) (B 
which concludes the computation of h. 






